UAM@SOCO 2014: Detection of Source Code Re-use by means of Combining Different Types of Representations

Authors

  • A. Ramírez-de-la-Cruz
  • G. Ramírez-de-la-Rosa
  • C. Sánchez-Sánchez
  • W. A. Luna-Ramírez
  • H. Jiménez-Salazar
  • C. Rodríguez-Lucatero
Abstract

This paper describes the participation of the Language and Reasoning group from UAM-C in the SOurce COde re-use competition (SOCO 2014). We propose different representations of source code, each of which attempts to highlight a different aspect of the code: i) lexical, ii) structural, and iii) stylistic. For the lexical view, we used a character 3-gram model that ignores all reserved words of the programming language under consideration. For the structural view, we proposed two similarity metrics that take into account the function signatures within a source code, namely the data types and the identifier names in each function signature. The third view accounts for several stylistic features, such as the number of white spaces, lines of code, uppercase letters, etc. Finally, we combined these representations in three ways, each of which was submitted as a run to this year's SOCO competition. The results indicate that the proposed representations provide information that allows particular cases of source code re-use to be detected.
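As a rough illustration of the lexical and stylistic views described in the abstract, the following Python sketch builds a character 3-gram profile with reserved words removed and counts three of the stylistic features named above. It is not the authors' implementation. The reserved-word list is a small hypothetical subset, the tokenizer is a simplification, and cosine similarity is an assumed choice, since the abstract does not say how profiles are compared. The structural view is left out because the abstract does not define its two metrics.

```python
import re
from collections import Counter
from math import sqrt

# Assumption: a small illustrative subset of reserved words, not a full language list.
RESERVED = {"public", "private", "static", "void", "int", "return",
            "class", "for", "if", "else", "while"}


def char_trigrams(code: str) -> Counter:
    """Count character 3-grams after dropping reserved words (lexical view)."""
    # Simplified tokenizer: identifiers/keywords, or any single non-space symbol.
    tokens = re.findall(r"[A-Za-z_]\w*|\S", code)
    kept = [t for t in tokens if t not in RESERVED]
    text = " ".join(kept)
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two 3-gram profiles (assumed comparison measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def style_features(code: str) -> dict:
    """Count a few of the stylistic features named in the abstract (stylistic view)."""
    return {
        "white_spaces": sum(c.isspace() for c in code),
        "lines_of_code": sum(1 for line in code.splitlines() if line.strip()),
        "upper_letters": sum(c.isupper() for c in code),
    }


if __name__ == "__main__":
    original = "public static int add(int x, int y) { return x + y; }"
    suspect = "public static int sum(int a, int b) { return a + b; }"
    print(cosine(char_trigrams(original), char_trigrams(suspect)))
    print(style_features(original))
```

A pair of files could then be scored by combining the lexical similarity with a distance between the two stylistic feature dictionaries. How the paper combines its views in each of the three runs is not stated in the abstract.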

Related resources

High Level Features for Detecting Source Code Plagiarism across Programming Languages

In this paper we describe the participation of the Language and Reasoning group from UAM-C in the context of the Cross Language SOurce COde re-use competition (CL-SOCO 2015). We proposed a representation of source code pairs by using five high level features; namely: i) lexical feature, ii) stylistic feature, iii) comments feature, iv) programmer’s text feature, and v) structure feature. We com...

PAN@FIRE: Overview of CL-SOCO Track on the Detection of Cross-Language SOurce COde Re-use

The detection of source code re-use is an important research field for both the software industry and academia. This paper summarizes the goals, organization and results of the second SOCO competitive evaluation campaign for systems that automatically detect the source code re-use phenomenon. PAN@FIRE shared task, named Cross-Language SOurce COde Re-use (CL-SOCO), focused on the detection of...

The effect of source shield on landmine detection

Background: Several landmine detection methods, based on nuclear techniques, have been suggested during the recent years. Neutron energy moderation, neutron-induced gamma emission, neutron and gamma attenuation, and fast neutron backscattering are nuclear-based methods used for landmine detection. The aim of this study is to use backscattered neutron for landmine detection. Materials ...

Investigating the capability of plastic scintillation detectors in design of a muon radiography system by Geant4 code

Imaging and identifying materials with high atomic numbers and densities, especially radioactive materials, is one of the issues that have been especially considered in recent years. Due to some limitations in conventional and old imaging techniques, finding an alternative method is very important. The cosmic muons with an infinite source are one of the sources that have been recently studied f...

Pisco: A Computational Approach to Predict Personality Types from Java Source Code

We developed an approach to automatically predict the personality traits of Java developers based on their source code for the PR-SOCO challenge 2016. The challenge provides a data set consisting of source code with their associated developers’ personality traits (neuroticism, extraversion, openness, agreeableness, and conscientiousness). Our approach adapts features from the authorship identif...

Publication date: 2014